Results 1 - 16 of 16
1.
Stat Methods Med Res ; 33(5): 875-893, 2024 May.
Article in English | MEDLINE | ID: mdl-38502023

ABSTRACT

The empirical likelihood is a powerful nonparametric tool that emulates its parametric counterpart, the parametric likelihood, preserving many of its large-sample properties. This article tackles the problem of assessing the discriminatory power of three-class diagnostic tests from an empirical likelihood perspective. In particular, we concentrate on interval estimation in a three-class receiver operating characteristic analysis, where a variety of inferential tasks could be of interest. We present novel theoretical results and tailored techniques designed to efficiently solve some of these tasks. Extensive simulation experiments are provided in a supporting role, with our novel proposals compared to existing competitors, when possible. It emerges that our new proposals are extremely flexible, being able to compete with existing methods and appearing suited to accommodating several distributions, such as mixtures, for target populations. We illustrate the application of the novel proposals with a real data example. The article ends with a discussion and a presentation of some directions for future research.


Subject(s)
ROC Curve , Likelihood Functions , Humans , Diagnostic Tests, Routine/statistics & numerical data , Models, Statistical , Computer Simulation
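
A common summary of discriminatory power in a three-class receiver operating characteristic analysis is the volume under the ROC surface (VUS). The sketch below is only an illustration of the plain empirical VUS, not the empirical likelihood procedure of the article; the function name and the sample values are hypothetical, and it assumes that test values tend to increase from class 1 to class 3.

```python
from itertools import product


def empirical_vus(class1, class2, class3):
    """Proportion of triples (x, y, z), one value per class, with x < y < z."""
    ordered = sum(
        1 for x, y, z in product(class1, class2, class3) if x < y < z
    )
    return ordered / (len(class1) * len(class2) * len(class3))


# Hypothetical test values for three ordered classes.
healthy = [0.8, 1.1, 1.4]
intermediate = [1.2, 1.9, 2.3]
diseased = [2.0, 2.8, 3.5]

vus = empirical_vus(healthy, intermediate, diseased)
print(f"Empirical VUS: {vus:.3f}")
```

A test with no discriminatory power gives a VUS near 1/6, and perfect separation of the three classes gives 1.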
2.
Stat Probab Lett ; 193, 2023 Feb.
Article in English | MEDLINE | ID: mdl-38584807

ABSTRACT

This work defines a new correction for the likelihood ratio test for a two-sample problem within the multivariate normal context. This correction applies to decomposable graphical models, where testing equality of distributions can be decomposed into lower dimensional problems.

3.
Stat Methods Med Res ; 31(7): 1325-1341, 2022 07.
Article in English | MEDLINE | ID: mdl-35360997

ABSTRACT

Statistical evaluation of diagnostic tests, and, more generally, of biomarkers, is a constantly developing field, in which complexity of the assessment increases with the complexity of the design under which data are collected. One particularly prevalent type of data is clustered data, where individual units are naturally nested into clusters. In these cases, bias can arise from omission, in the evaluation process, of cluster-level effects and/or individual covariates. Focusing on the three-class case and for continuous-valued diagnostic tests, we investigate how to exploit the clustered structure of data within a linear-mixed model approach, both when the assumption of normality holds and when it does not. We provide a method for the estimation of covariate-specific receiver operating characteristic surfaces and discuss methods for the choice of optimal thresholds, proposing three possible estimators. A proof of consistency and asymptotic normality of the proposed threshold estimators is given. All considered methods are evaluated by extensive simulation experiments. As an application, we study the use of the Lysosomal Associated Membrane Protein Family Member 5 gene expression as a biomarker to distinguish among three types of glutamatergic neurons.


Subject(s)
Models, Statistical , Bias , Biomarkers , Computer Simulation , Linear Models , Patient Selection , ROC Curve
4.
Stat Methods Med Res ; 30(2): 349-353, 2021 02.
Article in English | MEDLINE | ID: mdl-33779396

ABSTRACT

We comment here on a recent paper in this journal, on a non-monotone transformation of biomarkers aimed at improving diagnostic accuracy. We highlight that, in a binary classification problem, the proposed transformation finds its motivation in the Neyman-Pearson lemma, so that the underlying approach is very general and it is applicable to many parametric families, other than the normal one.


Subject(s)
ROC Curve , Biomarkers , Humans
5.
Spinal Cord ; 59(5): 538-546, 2021 May.
Article in English | MEDLINE | ID: mdl-32681119

ABSTRACT

STUDY DESIGN: Prospective cohort study. OBJECTIVES: To analyze the circadian rhythm and state-dependent modulation of core body temperature (Tcore) in individuals with spinal cord injury (SCI) under controlled environmental conditions. SETTING: Institute of the Neurological Sciences of Bologna, Italy. METHODS: We assessed 48-h rectal Tcore and sleep-wake cycle by means of video-polygraphic recording in five cervical SCI (cSCI), seven thoracic SCI (tSCI), and seven healthy controls under controlled environmental conditions. RESULTS: cSCI showed higher night-time Tcore values with reduced nocturnal decrease, higher MESOR and earlier acrophase compared with tSCI and controls (p < 0.05 in all comparisons). The mean Tcore values during wake and non-rapid eye movement (NREM) and rapid eye movement (REM) sleep stages were higher in cSCI compared with tSCI and controls (p < 0.05). Tcore variability throughout the 24 h differed significantly between cSCI, tSCI, and controls. CONCLUSIONS: cSCI had higher Tcore values without physiological night-time fall compared with controls and tSCI, and a disrupted Tcore circadian rhythm. Furthermore, SCI individuals did not display the physiological state-dependent Tcore modulation. The disconnection of the sympathetic nervous system from its central control caused by the SCI could affect thermoregulation including Tcore modulation during sleep. It is also possible that the reduced representation of deep sleep in people with SCI impairs such ability. Further studies are necessary to evaluate whether improvement of sleep could ameliorate thermoregulation and vice versa.


Subject(s)
Body Temperature , Spinal Cord Injuries , Circadian Rhythm , Humans , Prospective Studies , Sleep , Spinal Cord Injuries/complications
6.
PLoS Comput Biol ; 15(10): e1007357, 2019 10.
Article in English | MEDLINE | ID: mdl-31652275

ABSTRACT

Topological gene-set analysis has emerged as a powerful means for omic data interpretation. Although numerous methods for identifying dysregulated genes have been proposed, few of them aim to distinguish genes that are the real source of perturbation from those that merely respond to the signal dysregulation. Here, we propose a new method, called SourceSet, able to distinguish between the primary and the secondary dysregulation within a Gaussian graphical model context. The proposed method compares gene expression profiles in the control and in the perturbed condition and detects the differences in both the mean and the covariance parameters with a series of likelihood ratio tests. The resulting evidence is used to infer the primary and the secondary set, i.e. the genes responsible for the primary dysregulation, and the genes affected by the perturbation through network propagation. The proposed method demonstrates high specificity and sensitivity in different simulated scenarios and on several real biological case studies. In order to fit into the more traditional pathway analysis framework, the SourceSet R package also extends the analysis from a single pathway to multiple pathways and provides several graphical outputs, including Cytoscape visualization to browse the results.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Algorithms , Gene Regulatory Networks/genetics , Humans , Models, Theoretical , Normal Distribution , Sensitivity and Specificity , Software , Transcriptome/genetics
7.
Nucleic Acids Res ; 47(14): e80, 2019 08 22.
Article in English | MEDLINE | ID: mdl-31049575

ABSTRACT

Survival analysis of gene expression data has been a useful and widely used approach in clinical applications. However, in complex diseases such as cancer, the identification of survival-associated cell processes, rather than single genes, provides more informative results because the efficacy of survival prediction increases when multiple prognostic features are combined to enlarge the possibility of having druggable targets. Moreover, genome-wide screening in molecular medicine has rapidly grown, providing not only gene expression but also multi-omic measurements such as DNA mutations, methylation, expression, and copy number data. In cancer, virtually all these aberrations can contribute in synergy to pathological processes, and their measurements can improve a patient's outcome and help in diagnosis and treatment decisions. Here, we present MOSClip, an R package implementing a new topological pathway analysis tool able to integrate multi-omic data and look for survival-associated gene modules. MOSClip tests the survival association of dimensionality-reduced multi-omic data using multivariate models, providing graphical devices for management, browsing and interpretation of results. Using simulated data we evaluated MOSClip performance in terms of false positives and false negatives in different settings, while the TCGA ovarian cancer dataset is used as a case study to highlight MOSClip's potential.


Subject(s)
Computational Biology/methods , DNA Copy Number Variations , DNA Methylation , Gene Expression Profiling/methods , Mutation , Neoplasms/genetics , Algorithms , Gene Regulatory Networks , Humans , Kaplan-Meier Estimate , Neoplasms/diagnosis , Neoplasms/therapy , Reproducibility of Results , Signal Transduction/genetics
8.
Bioinformatics ; 33(3): 456-457, 2017 02 01.
Article in English | MEDLINE | ID: mdl-28172414

ABSTRACT

Summary: In the omic era, one of the main aims is to discover groups of functionally related genes that drive the difference between different conditions. To this end, a plethora of potentially useful multivariate statistical approaches has been proposed, but their evaluation is hindered by the absence of a gold standard. Here, we propose a method for simulating biological data (gene expression, RPKM/FPKM or protein abundances) from two conditions, namely, a reference condition and a perturbation of it. Our approach is built upon probabilistic graphical models and is thus especially suited for testing topological approaches. Availability and Implementation: simPATHy is an R package; it is open source and freely available on CRAN. Contacts: elisa.salviato.2@studenti.unipd.it or chiara.romualdi@unipd.it Supplementary Information: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Gene Expression Regulation , Models, Biological , Models, Statistical , Software , Computer Simulation
9.
Biom J ; 57(5): 852-66, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26149206

ABSTRACT

Current demand for understanding the behavior of groups of related genes, combined with the greater availability of data, has led to an increased focus on statistical methods in gene set analysis. In this paper, we aim to perform a critical appraisal of the methodology based on graphical models developed in Massa et al. (2010) that uses pathway signaling networks as a starting point to develop statistically sound procedures for gene set analysis. We pay attention to the potential of the methodology with respect to the organizational aspects of dealing with such complex but highly informative starting structures, that is, pathways. We focus on three themes: the translation of a biological pathway into a graph suitable for modeling, the role of shrinkage when more genes than samples are obtained, and the evaluation of the correspondence of the statistical models to the biological expectations. To study the impact of shrinkage, two simulation studies will be run. To evaluate the biological expectation, we will use data from a network with known behavior, which offers the possibility of carrying out a realistic check of the correspondence of the model to changes in the experimental conditions.


Subject(s)
Biometry/methods , Computer Graphics , Models, Statistical , Transcriptome , Algorithms , Humans , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/pathology , Male , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Signal Transduction
10.
Int J Biostat ; 11(1): 109-24, 2015 May.
Article in English | MEDLINE | ID: mdl-25781712

ABSTRACT

For a continuous-scale diagnostic test, the receiver operating characteristic (ROC) curve is a popular tool for displaying the ability of the test to discriminate between healthy and diseased subjects. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the test result and other characteristics of the subjects. Estimators of the ROC curve based only on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias, in particular under the assumption that the true disease status, if missing, is missing at random (MAR). The MAR assumption means that the probability of missingness depends on the true disease status only through the test result and observed covariate information. However, the existing methods require parametric models for the (conditional) probability of disease and/or the (conditional) probability of verification, and hence are subject to model misspecification: a wrong specification of such parametric models can affect the behavior of the estimators, which can be inconsistent. To avoid misspecification problems, in this paper we propose a fully nonparametric method for the estimation of the ROC curve of a continuous test under verification bias. The method is based on nearest-neighbor imputation and adopts generic smooth regression models for both the probability that a subject is diseased and the probability that the subject is verified. Simulation experiments and an illustrative example show the usefulness of the new method. Variance estimation is also discussed.


Subject(s)
Bias , ROC Curve , Statistics, Nonparametric , Breast Neoplasms/diagnosis , Humans
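
The idea of nearest-neighbor imputation under verification bias can be illustrated with a deliberately simplified sketch. It is not the estimator of the article: it assumes a single continuous test with no covariates, uses absolute distance between test values, and breaks ties by input order. All function names and data are hypothetical. Each subject's disease probability is estimated as the mean disease status of the K nearest verified subjects, and these probabilities then weight the sensitivity and specificity estimates at a chosen cut-off.

```python
def knn_disease_probability(tests, status, k):
    """Estimate P(diseased) for every subject from its k nearest verified subjects.

    status[i] is 1 or 0 for verified subjects and None when unverified.
    """
    verified = [(t, d) for t, d in zip(tests, status) if d is not None]
    probs = []
    for t in tests:
        nearest = sorted(verified, key=lambda pair: abs(pair[0] - t))[:k]
        probs.append(sum(d for _, d in nearest) / len(nearest))
    return probs


def imputed_sensitivity_specificity(tests, status, cutoff, k=2):
    """Sensitivity and specificity at `cutoff`, weighting all subjects by imputed probabilities."""
    rho = knn_disease_probability(tests, status, k)
    sens = sum(p for t, p in zip(tests, rho) if t >= cutoff) / sum(rho)
    spec = sum(1 - p for t, p in zip(tests, rho) if t < cutoff) / sum(
        1 - p for p in rho
    )
    return sens, spec


# Hypothetical data: two of six subjects were never verified (None).
tests = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
status = [0, None, 0, 1, None, 1]

sens, spec = imputed_sensitivity_specificity(tests, status, cutoff=1.75, k=2)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Repeating the calculation over a grid of cut-offs would trace out an estimated ROC curve; the choice of K and of the distance is left open here.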
11.
BMC Bioinformatics ; 15 Suppl 5: S3, 2014.
Article in English | MEDLINE | ID: mdl-25077979

ABSTRACT

BACKGROUND: Time-course gene expression experiments are useful tools for exploring biological processes. In this type of experiment, gene expression changes are monitored over time. Unfortunately, replication of time series is still costly, and long time courses usually do not have replicates. Many approaches have been proposed to deal with this data structure, but none of them in the field of pathway analysis. Pathway analyses have acquired great relevance for helping the interpretation of gene expression data. Several methods have been proposed to this aim: from the classical enrichment to the more complex topological analysis that gains power from the topology of the pathway. None of them was devised to identify temporal variations in time course data. RESULTS: Here we present timeClip, a topology-based pathway analysis specifically tailored to long time series without replicates. timeClip combines dimension reduction techniques and graph decomposition theory to explore and identify the portion of pathways that is most time-dependent. In the first step, timeClip selects the time-dependent pathways; in the second step, the most time-dependent portions of these pathways are highlighted. We used timeClip on simulated data and on a benchmark dataset from a mouse muscle regeneration model. Our approach shows good performance on different simulated settings. On the real dataset, we identify 76 time-dependent pathways, most of which are known to be involved in the regeneration process. Focusing on the 'mTOR signaling pathway' we highlight the timing of key processes of muscle regeneration: from the early pathway activation through growth factor signals to the late burst of protein production needed for fiber regeneration. CONCLUSIONS: timeClip represents an improvement in the field of time-dependent pathway analysis. It makes it possible to isolate and dissect pathways characterized by time-dependent components. Furthermore, using timeClip on a mouse muscle regeneration dataset we were able to characterize the process of muscle fiber regeneration with its correct timing.


Subject(s)
Gene Expression Profiling/methods , Muscles/metabolism , Signal Transduction , Animals , Humans , Mice , Regeneration , TOR Serine-Threonine Kinases/genetics , TOR Serine-Threonine Kinases/metabolism
12.
Nucleic Acids Res ; 41(1): e19, 2013 Jan 07.
Article in English | MEDLINE | ID: mdl-23002139

ABSTRACT

Gene set analysis using biological pathways has become a widely used statistical approach for gene expression analysis. A biological pathway can be represented through a graph where genes and their interactions are, respectively, nodes and edges of the graph. From a biological point of view only some portions of a pathway are expected to be altered; however, few methods using pathway topology have been proposed and none of them tries to identify the signal paths, within a pathway, most involved in the biological problem. Here, we present a novel algorithm for pathway analysis, clipper, that tries to fill this gap. clipper implements a two-step empirical approach based on the exploitation of graph decomposition into a junction tree to reconstruct the most relevant signal path. In the first step clipper selects significant pathways according to statistical tests on the means and the concentration matrices of the graphs derived from pathway topologies. Then, it identifies within these pathways the signal paths having the greatest association with a specific phenotype. We test our approach on simulated data and on two real expression datasets. Our results demonstrate the efficacy of clipper in the identification of signal transduction paths totally coherent with the biological problem.


Subject(s)
Algorithms , Signal Transduction/genetics , Transcriptome , Computer Simulation , Humans , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/metabolism , Muscular Dystrophies, Limb-Girdle/genetics , Muscular Dystrophies, Limb-Girdle/metabolism , Oligonucleotide Array Sequence Analysis , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism
13.
BMC Syst Biol ; 4: 121, 2010 Sep 01.
Article in English | MEDLINE | ID: mdl-20809931

ABSTRACT

BACKGROUND: Recently, a great effort in microarray data analysis has been directed towards the study of the so-called gene sets. A gene set is defined by genes that are, somehow, functionally related. For example, genes appearing in a known biological pathway naturally define a gene set. The gene sets are usually identified from a priori biological knowledge. Nowadays, many bioinformatics resources store this kind of knowledge (see, for example, the Kyoto Encyclopedia of Genes and Genomes, among others). Although pathway maps carry important information about the structure of correlation among genes that should not be neglected, the currently available multivariate methods for gene set analysis do not fully exploit it. RESULTS: We propose a novel gene set analysis specifically designed for gene sets defined by pathways. Such analysis, based on graphical models, explicitly incorporates the dependence structure among genes highlighted by the topology of pathways. The analysis is designed to be used for overall surveillance of changes in a pathway in different experimental conditions. In fact, under different circumstances, not only the expression of the genes in a pathway, but also the strength of their relations may change. The methods resulting from the proposal make it possible both to test for variations in the strength of the links, and to properly account for heteroscedasticity in the usual tests for differential expression. CONCLUSIONS: The use of graphical models allows a deeper look at the components of the pathway that can be tested separately and compared marginally. In this way it is possible to test single components of the pathway and highlight only those involved in its deregulation.


Subject(s)
Computational Biology/methods , Models, Genetic , Signal Transduction , Animals , Computer Graphics , ErbB Receptors/genetics , ErbB Receptors/metabolism , Gene Expression Profiling , Humans , Mice , Receptors, Antigen, B-Cell/genetics , Receptors, Antigen, B-Cell/metabolism
14.
Int J Biostat ; 6(1): Article 24, 2010.
Article in English | MEDLINE | ID: mdl-21969980

ABSTRACT

The evaluation of the ability of a diagnostic test to separate diseased subjects from non-diseased subjects is a crucial issue in modern medicine. The accuracy of a continuous-scale test at a chosen cut-off level can be measured by its sensitivity and specificity, i.e. by the probabilities that the test correctly identifies the diseased and non-diseased subjects, respectively. In practice, sensitivity and specificity of the test are unknown. Moreover, which cut-off level to use is also generally unknown, in that no preliminary indications driving its choice may be available. In this paper, we address the problem of making joint inference on pairs of quantities defining accuracy of a diagnostic test, in particular, when one of the two quantities is the cut-off level. We propose a technique based on an empirical likelihood statistic that makes it possible, within a unified framework, to build bivariate confidence regions for the pair (sensitivity, cut-off level) at a fixed value of specificity as well as for the pair (specificity, cut-off level) at a fixed value of sensitivity or the pair (sensitivity, specificity) at a fixed cut-off value. A simulation study is carried out to assess the finite-sample accuracy of the method. Moreover, we apply the method to two real examples.


Subject(s)
Confidence Intervals , Diagnostic Tests, Routine/statistics & numerical data , Evaluation Studies as Topic , Likelihood Functions , Statistics, Nonparametric , Case-Control Studies , Data Interpretation, Statistical , Finite Element Analysis , Humans , Sensitivity and Specificity
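
The definitions of sensitivity and specificity at a chosen cut-off level can be made concrete with a minimal sketch of their empirical point estimates. This is only an illustration and does not implement the empirical likelihood confidence regions of the article. It assumes that larger test values indicate disease and that a subject is classified as diseased when the value exceeds the cut-off; the function name and the data are hypothetical.

```python
def sensitivity_specificity(diseased, non_diseased, cutoff):
    """Empirical sensitivity and specificity of a continuous test at `cutoff`.

    Assumption: a value strictly above the cut-off is a positive result.
    """
    sens = sum(v > cutoff for v in diseased) / len(diseased)
    spec = sum(v <= cutoff for v in non_diseased) / len(non_diseased)
    return sens, spec


# Hypothetical test values.
diseased = [2.1, 3.4, 1.8, 4.0, 2.9]
non_diseased = [1.0, 1.5, 2.2, 0.7, 1.9]

sens, spec = sensitivity_specificity(diseased, non_diseased, cutoff=2.0)
print(f"sensitivity={sens}, specificity={spec}")
```

Raising the cut-off lowers sensitivity and raises specificity, which is why the abstract treats the cut-off level as one member of a pair to be inferred jointly.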
15.
Bioinformatics ; 25(20): 2685-91, 2009 Oct 15.
Article in English | MEDLINE | ID: mdl-19628505

ABSTRACT

MOTIVATION: Microarray normalization is a fundamental step in removing systematic bias and noise variability caused by technical and experimental artefacts. Several approaches, suitable for large-scale genome arrays, have been proposed and shown to be effective in the reduction of systematic errors. Most of these methodologies are based on specific assumptions that are reasonable for whole-genome arrays, but possibly unsuitable for small microRNA (miRNA) platforms. In this work, we propose a novel normalization (loessM), and we investigate, through simulated and real datasets, the influence that normalizations for two-colour miRNA arrays have on the identification of differentially expressed genes. RESULTS: We show that normalizations usually applied to large-scale arrays, in several cases, modify the actual structure of miRNA data, leading to large proportions of false positives and false negatives. Nevertheless, loessM is able to outperform other techniques in most experimental scenarios. Moreover, when the usual assumptions on differential expression distribution are not met, channel effect has a strikingly negative influence on small arrays, a bias that cannot be removed by normalizations but rather by an appropriate experimental design. We find that the combination of loessM with eCADS, an experimental design based on biological replicates dye-swap recently proposed for channel-effect reduction, gives better results in most of the experimental conditions in terms of specificity/sensitivity both on simulated and real data. AVAILABILITY: LoessM R function is freely available at http://gefu.cribi.unipd.it/papers/miRNA-simulation/


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , MicroRNAs/chemistry , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Databases, Genetic , Information Storage and Retrieval/methods
16.
BMC Bioinformatics ; 10: 61, 2009 Feb 13.
Article in English | MEDLINE | ID: mdl-19216778

ABSTRACT

BACKGROUND: Various normalisation techniques have been developed in the context of microarray analysis to try to correct expression measurements for experimental bias and random fluctuations. Major techniques include: total intensity normalisation; intensity dependent normalisation; and variance stabilising normalisation. The aim of this paper is to discuss the impact of normalisation techniques for two-channel array technology on the process of identification of differentially expressed genes. RESULTS: Through three precise simulation plans, we quantify the impact of normalisations: (a) on the sensitivity and specificity of a specified test statistic for the identification of deregulated genes, (b) on the gene ranking induced by the statistic. CONCLUSION: Although we found a limited difference of sensitivities and specificities for the test after each normalisation, the study highlights a strong impact in terms of gene ranking agreement, resulting in different levels of agreement between competing normalisations. However, we show that the combination of two normalisations, such as glog and lowess, that handle different aspects of microarray data, is able to outperform other individual techniques.


Subject(s)
Computer Simulation , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Sensitivity and Specificity
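
Of the techniques named in the abstract, total intensity normalisation is the simplest to illustrate. The sketch below assumes the usual formulation for two-channel arrays, in which one channel is rescaled so that both channels have the same summed intensity before log-ratios are computed. It is not the simulation plan or the code of the article, and the function names and intensities are hypothetical.

```python
import math


def total_intensity_normalise(red, green):
    """Rescale the green channel so that both channels share the same total intensity."""
    scale = sum(red) / sum(green)
    return list(red), [g * scale for g in green]


def log_ratios(red, green):
    """Per-spot log2(red / green), the quantity usually tested for differential expression."""
    return [math.log2(r / g) for r, g in zip(red, green)]


# Hypothetical spot intensities: the red channel is uniformly twice as bright.
red = [200.0, 400.0, 800.0, 1600.0]
green = [100.0, 200.0, 400.0, 800.0]

raw = log_ratios(red, green)
normalised = log_ratios(*total_intensity_normalise(red, green))
print(raw, normalised)
```

In this toy case the apparent two-fold difference is a pure channel effect, so the log-ratios drop to zero after normalisation. Intensity-dependent methods such as lowess address biases that a single global scale factor cannot remove.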